State-Correlated Duration Model for HMM-Based Speech Synthesis System
Authors
Abstract
This paper proposes a state-correlated duration model for HMM-based speech synthesis systems. An improved forward-backward algorithm is used to estimate the state-duration transition probability between neighboring states. In the synthesis stage, each state duration is determined by taking this state-duration transition probability into account. Experimental results show that speech synthesized with the new duration model is of higher quality.
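The abstract does not say how durations are selected once the state-duration transition probabilities are known. As a rough illustration only, the sketch below assumes durations are discrete frame counts, that a table of log-probabilities links the duration of each state to the duration of the next, and that the most probable duration sequence is found by Viterbi-style dynamic programming. The function name `best_durations` and its arguments are hypothetical and are not taken from the paper.

```python
import numpy as np

def best_durations(init_logp, trans_logp):
    """Pick the jointly most probable duration for each state.

    Assumptions (not from the paper): durations are 1..D frames,
    init_logp[a] = log P(d_1 = a + 1), and trans_logp[i][a, b] is the
    log-probability that the next state lasts b + 1 frames given that
    the current state lasts a + 1 frames.
    """
    score = np.asarray(init_logp, dtype=float)
    backpointers = []
    for table in trans_logp:
        # cand[a, b]: best score ending with previous duration a, next duration b
        cand = score[:, None] + np.asarray(table, dtype=float)
        backpointers.append(cand.argmax(axis=0))
        score = cand.max(axis=0)

    # Trace back from the best final duration index.
    idx = int(score.argmax())
    path = [idx]
    for bp in reversed(backpointers):
        idx = int(bp[idx])
        path.append(idx)
    path.reverse()
    return [i + 1 for i in path]  # convert indices to frame counts

# Two states, durations of 1 or 2 frames (made-up numbers).
init = np.log([0.55, 0.45])
trans = [np.log([[0.50, 0.50],
                 [0.01, 0.99]])]
durations = best_durations(init, trans)
```

With these made-up numbers, choosing each duration independently would give the first state 1 frame, whereas the joint search gives it 2 frames because that choice makes the following duration far more predictable. This is the kind of neighbor dependency a state-correlated model is meant to capture.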
Similar resources
Explicit duration modelling in HMM-based speech synthesis using a hybrid hidden Markov model-multilayer perceptron
In HMM-based speech synthesis, it is important to model duration correctly because it has a significant effect on the perceptual quality of speech, such as rhythm. For this reason, the hidden semi-Markov model (HSMM) is commonly used to model duration explicitly instead of relying on the implicit state duration model that an HMM provides through its transition probabilities. The cost of using HSMM to improve duration...
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
In this paper, we describe an HMM-based speech synthesis system in which spectrum, pitch and state duration are modeled simultaneously in a unified framework of HMM. In the system, pitch and state duration are modeled by multi-space probability distribution HMMs and multi-dimensional Gaussian distributions, respectively. The distributions for spectral parameter, pitch parameter and the state du...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Hidden semi-Markov model based speech synthesis
In the present paper, a hidden semi-Markov model (HSMM) based speech synthesis system is proposed. In a hidden Markov model (HMM) based speech synthesis system which we have proposed, rhythm and tempo are controlled by state duration probability distributions modeled by single Gaussian distributions. To synthesize speech, it constructs a sentence HMM corresponding to an arbitrarily given text an...
Duration modeling for HMM-based speech synthesis
This paper proposes a new approach to state duration modeling for HMM-based speech synthesis. The set of state durations of each phoneme HMM is modeled by a multi-dimensional Gaussian distribution, and the duration models are clustered using a decision-tree-based context clustering technique. In the synthesis stage, state durations are determined using the state duration models. In this paper, we ...